Random sampling from a search engine's index



Similar resources

Random Sampling from a Search Engine’s Corpus

We revisit a problem introduced by Bharat and Broder almost a decade ago: how to sample random pages from the corpus of documents indexed by a search engine, using only the search engine’s public interface? Such a primitive is particularly useful in creating objective benchmarks for search engines. The technique of Bharat and Broder suffers from a well-recorded bias: it favors long documents. I...


Random mappings designed for commercial search engines

We give a practical random mapping that takes any set of documents represented as vectors in Euclidean space and then maps them to a sparse subset of the Hamming cube while retaining ordering of inter-vector inner products. Once represented in the sparse space, it is natural to index documents using commercial text-based search engines which are specialized to take advantage of thi...


3D Inverted Index with Cache Sharing for Web Search Engines

Web search engines achieve efficient performance by partitioning and replicating the indexing data structure used to support query processing. Current practice simply partitions and replicates the text collection on the set of cluster processors and then constructs in each processor an index data structure. This paper proposes a different approach by constructing an index data structure that pr...


Tree Search Stabilization by Random Sampling

We discuss the variability in the performance of multiple runs of Mixed Integer Linear solvers, and we concentrate on the one deriving from the use of different optimal bases of the Linear Programming relaxations. We propose a new algorithm exploiting more than one of those bases and we show that different versions of the algorithm can be used to stabilize and improve the performance of the sol...


Random Search versus Genetic Programming as Engines for Collective Adaptation

We have integrated the distributed search of genetic programming (GP) based systems with collective memory to form a collective adaptation search method. Such a system significantly improves search as problem complexity is increased. Since the pure GP approach does not scale well with problem complexity, a natural question is which of the two components is actually contributing to the search pro...



Journal

Journal title: Journal of the ACM

Year: 2008

ISSN: 0004-5411,1557-735X

DOI: 10.1145/1411509.1411514